Repeat-aware Comparative Genome Assembly
Abstract
Current high-throughput sequencing technologies produce gigabytes of data even when prokaryotic genomes are sequenced. In a subsequent assembly phase, the generated overlapping reads are merged, ideally into one contiguous sequence. Often, however, the assembly yields a set of contigs that must be stitched together with additional lab work. One reason the assembly produces several distinct contigs is the presence of repetitive elements in the newly sequenced genome. While knowing the order and orientation of a set of non-repetitive contigs helps to close the gaps between them, repetitive contigs require special care. Here we propose an algorithm that orders a set of contigs with respect to a related reference genome while treating the repetitive contigs appropriately.
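The idea of ordering contigs against a reference while handling repeats can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper, whose details the abstract does not give. It assumes that each contig has already been matched to the reference, that a contig matching more than one reference position counts as repetitive, and that a repetitive contig is placed once per occurrence. The names `Match` and `order_contigs` are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Match(NamedTuple):
    """One match of a contig on the reference genome (hypothetical record)."""
    contig: str     # contig identifier
    ref_start: int  # start position of the match on the reference
    strand: str     # "+" or "-": orientation relative to the reference


def order_contigs(matches: list[Match]) -> list[tuple[str, str, bool]]:
    """Return (contig, strand, is_repetitive) tuples in reference order."""
    by_contig = defaultdict(list)
    for m in matches:
        by_contig[m.contig].append(m)

    # Assumption: a contig with more than one match position is repetitive.
    repetitive = {c for c, ms in by_contig.items() if len(ms) > 1}

    # Sort every match by reference position; a repetitive contig
    # therefore appears once for each of its occurrences.
    ordered = sorted(matches, key=lambda m: m.ref_start)
    return [(m.contig, m.strand, m.contig in repetitive) for m in ordered]


matches = [
    Match("c2", 5000, "-"),
    Match("r1", 3000, "+"),
    Match("c1", 0, "+"),
    Match("r1", 9000, "+"),
    Match("c3", 10000, "+"),
]
layout = order_contigs(matches)
for contig, strand, is_rep in layout:
    print(contig, strand, "repetitive" if is_rep else "unique")
```

In this toy input, contig `r1` matches two reference positions, so it is flagged as repetitive and appears twice in the layout, while the unique contigs each appear once with their orientation. A real method would also have to cope with rearrangements between the two genomes and with ambiguous or partial matches, which this sketch ignores.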
Similar resources
Comparative bioinformatics analysis of a wild diploid Gossypium with two cultivated allotetraploid species
Background: Gossypium thurberi is a wild diploid species that has been used to improve cultivated allotetraploid cotton. G. thurberi belongs to the D genome and is an important wild bio-source for cotton breeding and genetic research. To a certain degree, chloroplast DNA sequence information is a versatile tool for species identification and phylogenetic implications in plants. Different ch...
Msh1 Influence on Plant Mitochondrial Genome Recombination and Phenotype in Tobacco
Recombination activity plays an important role in the heteroplasmic and stoichiometric variation of plant mitochondrial genomes. Recent studies show that the nuclear gene MSH1 functions to suppress asymmetric recombination at 47 repeat pairs within the Arabidopsis mitochondrial genome. Two additional nuclear genes, RECA3 and OSB1, have also been shown to participate in the control of mitochondr...
Chromosomer: a reference-based genome arrangement tool for producing draft chromosome sequences
BACKGROUND As the number of sequenced genomes rapidly increases, chromosome assembly is becoming an even more crucial step of any genome study. Since de novo chromosome assemblies are confounded by repeat-mediated artifacts, reference-assisted assemblies that use comparative inference have become widely used, prompting the development of several reference-assisted assembly programs for prokaryo...
Evolutionary and comparative analyses of the soybean genome
The soybean genome assembly has been available since the end of 2008. Significant features of the genome include large, gene-poor, repeat-dense pericentromeric regions, spanning roughly 57% of the genome sequence; a relatively large genome size of ~1.15 billion bases; remnants of a genome duplication that occurred ~13 million years ago (Mya); and fainter remnants of older polyploidies that occu...
The draft genome assembly of Rhododendron delavayi Franch. var. delavayi
Rhododendron delavayi Franch. is globally famous as an ornamental plant. Its distribution in southwest China covers several different habitats and environments. However, not much research had been conducted on Rhododendron spp. at the molecular level, which hinders understanding of its evolution, speciation, and synthesis of secondary metabolites, as well as its wide adaptability to different e...
Publication date: 2010